122 research outputs found

    Author Identifiers in Scholarly Repositories

    Bibliometric and usage-based analyses and tools highlight the value of information about scholarship contained within the network of authors, articles and usage data. Less progress has been made on populating and using the author side of this network than the article side, in part because of the difficulty of unambiguously identifying authors. I briefly review a sample of author identifier schemes, and consider their use in scholarly repositories. I then describe preliminary work at arXiv to implement public author identifiers, services based on them, and plans to make this information useful beyond the boundaries of arXiv.
    Comment: 10 pages. Based on a presentation given at Open Repositories 200

    Eprints and the Open Archives Initiative

    The Open Archives Initiative (OAI) was created as a practical way to promote interoperability between eprint repositories. Although the scope of the OAI has been broadened, eprint repositories still represent a significant fraction of OAI data providers. In this article I present a brief survey of OAI eprint repositories, and of services using metadata harvested from eprint repositories using the OAI protocol for metadata harvesting (OAI-PMH). I then discuss several situations where metadata harvesting may be used to further improve the utility of eprint archives as a component of the scholarly communication infrastructure.
    Comment: 13 pages

    Exposing and harvesting metadata using the OAI metadata harvesting protocol: A tutorial

    In this article I outline the ideas behind the Open Archives Initiative metadata harvesting protocol (OAIMH), and attempt to clarify some common misconceptions. I then consider how the OAIMH protocol can be used to expose and harvest metadata. Perl code examples are given as practical illustration.
    Comment: 13 pages, 1 figure. Example programs included (download source). HEPLW version (HTML) available online at http://library.cern.ch/HEPLW/4/papers/3
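
    As a rough illustration of what harvesting metadata can look like, here is a minimal sketch in Python rather than the Perl used in the article. It is not the article's code. It assumes the later OAI-PMH 2.0 conventions (the `ListRecords` verb, the `oai_dc` metadata prefix, and the `resumptionToken` element); the base URL, the function names, and the sample response are hypothetical and exist only for this example.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces used by OAI-PMH 2.0 responses and unqualified Dublin Core.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def build_request(base_url, verb, **params):
    """Build an OAI-PMH request URL for the given verb and arguments."""
    query = {"verb": verb}
    query.update(params)
    return base_url + "?" + urlencode(query)


def parse_list_records(xml_text):
    """Return ([(identifier, title), ...], resumption_token_or_None)."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iter(OAI_NS + "record"):
        identifier = rec.findtext(f"{OAI_NS}header/{OAI_NS}identifier")
        # Take the first dc:title found inside the record, if any.
        title = next((el.text for el in rec.iter(DC_NS + "title")), None)
        records.append((identifier, title))
    # An absent or empty resumptionToken means the list is complete.
    token = root.findtext(f".//{OAI_NS}resumptionToken")
    return records, (token or None)


# Hypothetical response, hand-written for this example (no network access).
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example title</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>abc123</resumptionToken>
  </ListRecords>
</OAI-PMH>
""".strip()

if __name__ == "__main__":
    # Hypothetical repository endpoint; substitute a real base URL to harvest.
    url = build_request("https://repository.example.org/oai",
                        "ListRecords", metadataPrefix="oai_dc")
    print(url)
    print(parse_list_records(SAMPLE_RESPONSE))
```

    A harvester built along these lines would fetch the URL, parse the response, and repeat the request with the returned resumption token until none is returned.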

    Author identifiers: 1) Services at arXiv and 2) ORCID and repositories

    I will present two separate but related topics, where experience with the first provides much of my perspective on the second. Public author identifiers and services based on them were introduced in March 2009, and early work and design was reported at OR09. The original services have been running for a year now and additional facilities have been added. I will report on uptake and usage patterns, and describe the more popular services. ORCID is an exciting initiative involving both commercial and academic participants that aims to build a registry and assign identifiers to address the author ambiguity problem. I will report on the current status of this rapidly evolving project and suggest how the repository community may contribute to and benefit from it.

    Plagiarism Detection in arXiv

    We describe a large-scale application of methods for finding plagiarism in research document collections. The methods are applied to a collection of 284,834 documents collected by arXiv.org over a 14-year period, covering a few different research disciplines. The methodology efficiently detects a variety of problematic author behaviors, and heuristics are developed to reduce the number of false positives. The methods are also efficient enough to implement as a real-time submission screen for a collection many times larger.
    Comment: Sixth International Conference on Data Mining (ICDM'06), Dec 200